Using an ASR database to design a pronunciation evaluation system in Basque
Abstract
This paper presents a method for building CAPT systems for under-resourced languages, such as Basque, using a general-purpose ASR speech database. More precisely, the proposed method automatically determines the threshold of the GOP (Goodness Of Pronunciation) scores, which are used as phone-level pronunciation scores. Two score distributions are obtained for each phoneme, corresponding to its correct and incorrect pronunciations. The distribution of scores for erroneous pronunciations is calculated by inserting controlled errors into the dictionary, so that each changed phoneme is randomly replaced by a phoneme from the same group. These groups are obtained by means of a phonetic clustering performed using regression trees. After obtaining both distributions, the EER (Equal Error Rate) of each distribution pair is calculated and used as the decision threshold for that phoneme. The results show that this method is useful when no database specifically designed for CAPT systems is available, although it is not as accurate as using a database specifically designed for this purpose.
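The per-phoneme thresholding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `eer_threshold`, the exhaustive search over observed scores, and the convention that a higher GOP score means a better pronunciation are all assumptions made for illustration.

```python
import numpy as np


def eer_threshold(correct_scores, incorrect_scores):
    """Find the score threshold where false accepts and false rejects balance.

    Assumption (not from the paper): a higher GOP score means a better
    pronunciation, so a phone is accepted when its score is >= threshold.
    Returns (threshold, approximate equal error rate).
    """
    correct = np.asarray(correct_scores, dtype=float)
    incorrect = np.asarray(incorrect_scores, dtype=float)

    # Candidate thresholds: every score observed in either distribution.
    candidates = np.unique(np.concatenate([correct, incorrect]))

    best = None
    for t in candidates:
        frr = np.mean(correct < t)     # correct pronunciations rejected
        far = np.mean(incorrect >= t)  # incorrect pronunciations accepted
        gap = abs(far - frr)
        # Keep the first threshold at which the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2.0)

    _, threshold, eer = best
    return float(threshold), float(eer)


if __name__ == "__main__":
    # Toy scores for a single phoneme (made-up numbers, illustration only).
    correct = [1.0, 2.0, 3.0, 4.0]
    incorrect = [0.0, 1.0, 2.0, 3.0]
    print(eer_threshold(correct, incorrect))
```

In this sketch the function would be called once per phoneme, with `correct_scores` taken from alignments against the original dictionary and `incorrect_scores` from alignments against the dictionary with controlled substitutions. With finite samples the two error rates rarely coincide exactly, so the reported rate is the mean of the two at the closest point.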
Similar resources
Phonetic and Prosodic Aspects in the Cross-lingual Pronunciation Tutoring
Computer-assisted pronunciation tutoring (CAPT) methods have been well established in research and education. Common system approaches include phonetic quality assessment and highlighting of problematic sections in the speech signal, and usually rely on automatic speech recognition (ASR) for the target language L2. The contribution deals with the audiovisual CAPT system AzAR. An extensive feedb...
Empathy as a Predictor of Pronunciation Mastery: The Case of Female Iranian EFL Learners’ Pronunciation Errors
The present study set out to identify the problematic areas of pronunciation among Iranian female EFL learners. Further, this study investigated the relationship between empathy and authentic pronunciation, along with gender as a moderator variable. Comparing segmental features and phonological processes of both languages helped teachers to predict the target errors. To reach such a goa...
Malay Grapheme to Phoneme Tool for Automatic Speech Recognition
This paper presents the design and performance of a Malay grapheme-to-phoneme (G2P) tool for generating the pronunciation dictionary of a Malay automatic speech recognition (ASR) system. The G2P tool is a rule-based system. It is flexible in adding and removing rules and in handling English words. The G2P tool also contains a morphological and syllable tool, which it uses to determine the pronu...
Basque Speecon-like and Basque SpeechDat MDB-600: speech databases for the development of ASR technology for Basque
This paper introduces two databases specifically designed for the development of ASR technology for the Basque language: the Basque Speecon-like database and the Basque SpeechDat MDB-600 database. The former was recorded in an office environment according to the Speecon specifications, whereas the latter was recorded through mobile telephones according to the SpeechDat specifications. Both datab...
Development of a Computer-Aided Language Learning System for Mandarin – Tone Recognition and Pronunciation Error Detection
This paper reports on the continued activities towards the development of a computer-aided language learning system for teaching Mandarin to Germans. A method for f0 normalization based on maximum likelihood estimation and tone recognition was implemented. Furthermore, a method for detecting the pronunciation errors was tested by calculating the confidence distance between the first and second ...